Weather Events and their Impacts on Human Health and Economics
Synopsis
The NOAA Storm Database collects storm data reported by the National Weather Service from across the US. This project quantifies the impact of the documented storm events from both an economic and a human perspective. The idea is to compare event types in order to answer the following questions:
1. Across the United States, which types of events are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?
This work was done as a project for the Reproducible Research course in the Data Science Specialization. This knitr-generated publication documents all the work done (in R) to complete that project.
Read the full data documentation
Load up all dependencies and set global variables
Download Data if Missing
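The code for this step is not shown in this rendering, so the following is only a minimal base R sketch of the "download only if missing" idea. The helper name `download_if_missing` and the example URL are hypothetical placeholders, not the values used in the original analysis.

```r
# Hypothetical helper: fetch `url` into `dest` only when `dest` is absent.
download_if_missing <- function(url, dest) {
  if (file.exists(dest)) {
    return(FALSE)  # file already on disk, nothing downloaded
  }
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  download.file(url, destfile = dest, mode = "wb")  # binary mode for .bz2
  TRUE
}

# Demonstration with a file that already exists, so no network access is needed.
existing <- tempfile(fileext = ".csv.bz2")
file.create(existing)
downloaded <- download_if_missing("https://example.org/StormData.csv.bz2", existing)
print(downloaded)
```

Returning a logical makes it easy to log whether a fresh download happened when the report is re-knit.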
Load Data onto R
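As a rough illustration of this step, the sketch below writes a tiny bzip2-compressed CSV and reads it back with base R. The toy rows and the temporary path are invented for the example; only the data frame name `df` comes from this report, and the original analysis may well have used a different reader.

```r
# Build a tiny stand-in for the storm file so the sketch runs offline.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  FATALITIES = c(3, 1),
                  stringsAsFactors = FALSE)
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the bzip2 compression and decompresses while reading.
df <- read.csv(path, stringsAsFactors = FALSE)
str(df)
```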
The Data
Table
Columns
| # | colnames(df) |
|---|---|
| 1 | STATE__ |
| 2 | BGN_DATE |
| 3 | BGN_TIME |
| 4 | TIME_ZONE |
| 5 | COUNTY |
| 6 | COUNTYNAME |
| 7 | STATE |
| 8 | EVTYPE |
| 9 | BGN_RANGE |
| 10 | BGN_AZI |
| 11 | BGN_LOCATI |
| 12 | END_DATE |
| 13 | END_TIME |
| 14 | COUNTY_END |
| 15 | COUNTYENDN |
| 16 | END_RANGE |
| 17 | END_AZI |
| 18 | END_LOCATI |
| 19 | LENGTH |
| 20 | WIDTH |
| 21 | F |
| 22 | MAG |
| 23 | FATALITIES |
| 24 | INJURIES |
| 25 | PROPDMG |
| 26 | PROPDMGEXP |
| 27 | CROPDMG |
| 28 | CROPDMGEXP |
| 29 | WFO |
| 30 | STATEOFFIC |
| 31 | ZONENAMES |
| 32 | LATITUDE |
| 33 | LONGITUDE |
| 34 | LATITUDE_E |
| 35 | LONGITUDE_ |
| 36 | REMARKS |
| 37 | REFNUM |
Data Processing
Data Subsetting
We are only interested in the following columns, so we keep just these:
- Event Type: EVTYPE
- Fatalities: FATALITIES
- Injuries: INJURIES
- Damage to Property: PROPDMG and PROPDMGEXP
- Damage to Crops: CROPDMG and CROPDMGEXP
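The subsetting code itself is not shown here. A minimal base R sketch of keeping only the columns of interest could look like the following. The toy rows are invented, and the original analysis appears to use data.table rather than a plain data frame, so treat this as an illustration of the idea only.

```r
# Toy data frame with a handful of the 37 columns, including two we do not need.
df <- data.frame(STATE = c("AL", "TX"),
                 EVTYPE = c("TORNADO", "HAIL"),
                 FATALITIES = c(0, 1), INJURIES = c(15, 0),
                 PROPDMG = c(25, 2.5), PROPDMGEXP = c("K", "M"),
                 CROPDMG = c(0, 10), CROPDMGEXP = c("", "K"),
                 REFNUM = 1:2, stringsAsFactors = FALSE)

keep <- c("EVTYPE", "FATALITIES", "INJURIES",
          "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")
dmg <- df[, keep]  # drop every column not listed in `keep`
print(colnames(dmg))
```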
Quantifying Data
Making the Data Visualization-ready
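The dataset uses a two-column notation for crop and property damage: a magnitude in one column (PROPDMG, CROPDMG) and an exponent symbol in a companion column (PROPDMGEXP, CROPDMGEXP). For example, with K standing for thousands, a damage of 25,000 dollars would be stored as 25 in the magnitude column and K in the exponent column.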
Let’s have a look at the "xEXP" values present in the dataset.
[1] K M B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
Levels: - ? + 0 1 2 3 4 5 6 7 8 B h H K m M
[1] M K m B ? 0 k 2
Levels: ? 0 2 B k K m M
Each unique symbol needs a numeric value. Let’s create key-value pairs with our assigned multipliers as values, so that every symbol can be replaced by the power of ten it stands for.
1. Change all xEXP entries to uppercase.
2. Map property and crop damage alphanumeric exponents to numeric values.
propDmgKey <- c("\"\"" = 10^0,
"-" = 10^0,
"+" = 10^0,
"0" = 10^0,
"1" = 10^1,
"2" = 10^2,
"3" = 10^3,
"4" = 10^4,
"5" = 10^5,
"6" = 10^6,
"7" = 10^7,
"8" = 10^8,
"9" = 10^9,
"H" = 10^2,
"K" = 10^3,
"M" = 10^6,
"B" = 10^9)
cropDmgKey <- c("\"\"" = 10^0,
"?" = 10^0,
"0" = 10^0,
"K" = 10^3,
"M" = 10^6,
"B" = 10^9)
3. Replace the values in PROPDMGEXP and CROPDMGEXP with their corresponding numeric values.
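# Look up each symbol's multiplier; symbols missing from the key become NA.
dmg[, PROPDMGEXP := propDmgKey[as.character(PROPDMGEXP)]]
# Treat blank or unrecognized symbols as a multiplier of 1.
dmg[is.na(PROPDMGEXP), PROPDMGEXP := 10^0]
dmg[, CROPDMGEXP := cropDmgKey[as.character(CROPDMGEXP)]]
dmg[is.na(CROPDMGEXP), CROPDMGEXP := 10^0]
NOTE: Using mutate for this task did not work as intended, so data.table's := assignment is used here instead.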
4. Use mutate to create columns for PropertyDamage, CropDamage and TotalDamage and get rid of the raw column data
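The four steps can be sketched end to end on toy data. This version uses base R instead of mutate and data.table, a shortened key, and a hypothetical helper `to_multiplier`; the rows are invented for illustration. Only the column names come from this report.

```r
# Toy stand-in for `dmg`; the values are invented for illustration.
dmg <- data.frame(EVTYPE = c("TORNADO", "HAIL", "FLOOD"),
                  PROPDMG = c(25, 2.5, 1), PROPDMGEXP = c("K", "m", ""),
                  CROPDMG = c(0, 10, 3),   CROPDMGEXP = c("", "k", "?"),
                  stringsAsFactors = FALSE)

# Shortened key for the sketch (assumption: same meaning as the full keys).
expKey <- c("0" = 10^0, "H" = 10^2, "K" = 10^3, "M" = 10^6, "B" = 10^9)

# Hypothetical helper covering steps 1 to 3.
to_multiplier <- function(symbols, key) {
  mult <- unname(key[toupper(as.character(symbols))])  # uppercase, then look up
  mult[is.na(mult)] <- 10^0                            # blank or unknown symbols
  mult
}

# Step 4: build the dollar columns, then drop the raw ones.
dmg$PropertyDamage <- dmg$PROPDMG * to_multiplier(dmg$PROPDMGEXP, expKey)
dmg$CropDamage     <- dmg$CROPDMG * to_multiplier(dmg$CROPDMGEXP, expKey)
dmg$TotalDamage    <- dmg$PropertyDamage + dmg$CropDamage
raw <- c("PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")
dmg <- dmg[, !(names(dmg) %in% raw)]
print(dmg)
```

Falling back to a multiplier of 1 for unmatched symbols mirrors the `is.na` replacement in step 3, so blank and "?" entries keep their raw magnitude.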
Results
Summary of Data
Let’s first summarize the data to find the total economic damage and the total human toll (injuries and fatalities) of each type of weather event.
dmg %>%
group_by(EVTYPE) %>%
summarize(PropertyDamage = sum(PropertyDamage),
CropDamage = sum(CropDamage),
TotalDamage = sum(TotalDamage),
Injuries = sum(INJURIES),
Fatalities = sum(FATALITIES))
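To answer the two questions from the synopsis, the per-event totals still need to be ranked. The base R sketch below shows one way to do that. The data frame `totals` and its numbers are invented stand-ins for the summary above, and ranking health impact by fatalities alone is an assumption made for the example.

```r
# Toy per-event totals shaped like the summary; the numbers are invented.
totals <- data.frame(EVTYPE = c("FLOOD", "TORNADO", "HEAT"),
                     TotalDamage = c(300, 100, 5),
                     Injuries = c(20, 90, 40),
                     Fatalities = c(4, 50, 9),
                     stringsAsFactors = FALSE)

# Largest economic consequences first.
by_damage <- totals[order(-totals$TotalDamage), ]

# Most harmful to population health first (assumption: ranked by fatalities).
by_health <- totals[order(-totals$Fatalities), ]

print(head(by_damage, 2))
print(head(by_health, 2))
```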